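As the title says, I'm new to web scraping and want to practice logging in as a user.
Login site: iT邦幫忙
I already grab the token and cookies from the login page, but the request still returns 419.
My code is below. Any help would be much appreciated. Thanks!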
import requests
from bs4 import BeautifulSoup
from urllib import request,parse
from http.cookiejar import CookieJar
import urllib
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
response = urllib.request.urlopen('https://www.python.org')
response.read().decode('utf-8')
headers = {
    'User-Agent': 'my user-agent',
}
# Fetch the token from the login page
session = requests.Session()
url = 'https://member.ithome.com.tw/login'
response = session.get(url,headers = headers)
soup = BeautifulSoup(response.text,'html5lib')
token = soup.find('input',{'name':"_token"})['value']
data = {
    'account': 'myaccount',
    'password': 'mypassword',
    '_token': str(token),
}
# Fetch cookies
cookiejar = CookieJar()
handler = request.HTTPCookieProcessor(cookiejar)
opener = request.build_opener(handler)
cookies = {}
resp = opener.open('https://member.ithome.com.tw/login')
for c in (list(cookiejar)):
    cookies[c.name] = c.value
headers['Cookie']= f'_ga=GA1.3.249018218.1653631398; _gid=GA1.3.1048506314.1681834763; XSRF-TOKEN={cookies["XSRF-TOKEN"]}; ithomemembercenter_session={cookies["ithomemembercenter_session"]}'
resp = session.post(url, data = data ,headers = headers)
print(resp.status_code)
If you comment out these lines, it appears to return 200:
for c in (list(cookiejar)):
    cookies[c.name] = c.value
headers['Cookie']= f'_ga=GA1.3.249018218.1653631398; _gid=GA1.3.1048506314.1681834763; XSRF-TOKEN={cookies["XSRF-TOKEN"]}; ithomemembercenter_session={cookies["ithomemembercenter_session"]}'
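Look through a few more examples and make sure you understand what each line does; that will make problems like this much easier to solve.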
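
A likely reason commenting out those lines helps: the `_token` value was issued together with the cookies held by `session`, while the separate `opener.open(...)` call is a second visit that receives its own `XSRF-TOKEN` and `ithomemembercenter_session`. A hand-built `Cookie` header then pairs the form token with a different server-side session, and 419 is the status that Laravel-style CSRF checks typically return for that kind of mismatch. Below is a minimal sketch of the single-session flow, assuming the form fields are `account`, `password`, and `_token` as in the question. `extract_token` and `login` are hypothetical helper names, and the token is parsed with the standard library's `html.parser` instead of BeautifulSoup only to keep the sketch free of extra dependencies.

```python
from html.parser import HTMLParser


class _TokenParser(HTMLParser):
    """Remembers the value of the first <input name="_token"> it sees."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name") == "_token" and self.token is None:
            self.token = attrs.get("value")


def extract_token(html):
    """Return the hidden CSRF token from the login page HTML (hypothetical helper)."""
    parser = _TokenParser()
    parser.feed(html)
    if parser.token is None:
        raise ValueError("no _token input found in the login page")
    return parser.token


def login(session, url, account, password, headers=None):
    """GET the login page, then POST the form with the SAME session object.

    `session` is expected to behave like requests.Session: it stores the
    cookies from the GET and sends them back automatically on the POST,
    so no manual 'Cookie' header is built here.
    """
    page = session.get(url, headers=headers)
    data = {
        "account": account,
        "password": password,
        "_token": extract_token(page.text),
    }
    return session.post(url, data=data, headers=headers)


# Usage (needs the third-party `requests` package; placeholders as in the question):
#   import requests
#   with requests.Session() as s:
#       resp = login(s, "https://member.ithome.com.tw/login",
#                    "myaccount", "mypassword",
#                    headers={"User-Agent": "my user-agent"})
#       print(resp.status_code)
```

Note that a 200 only shows the POST was no longer rejected by the token check. To confirm the login itself worked, inspect the returned page, for example for your account name or a logout link.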